To understand the topics and words soldiers used in their textual responses, we used bigrams and co-occurrences to create networks. Bigrams are pairs of words that appear immediately next to each other in a response. For example, in the sentence "I run with my dog", "run with" is a bigram. For co-occurrences, words are correlated with each other at the response level. For example, in the same sentence, "run" and "dog" co-occur even though they are not adjacent. We used these two levels of analysis to create text networks that show which words soldiers used and how those words fit together in a larger network of verbiage.
We created bigrams for each group of soldiers and each set of responses. From these bigrams we can visualize words that are used in tandem. This is important for understanding compound terms, and we rely on it later in our analysis when we unionize terms.
To visualize co-occurrence text networks, we used Gephi, an open-source tool for network visualization. With it we can visualize and compare the words and topics discussed by black and white soldiers in their long responses.
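The distinction between the two levels can be illustrated with a minimal Python sketch. It is not the pipeline used for this analysis: the function names are hypothetical, the tokenizer is a deliberately simple assumption, and it counts how often word pairs share a response rather than computing a correlation.

```python
from collections import Counter
from itertools import combinations


def tokenize(response):
    """Lowercase, split on whitespace, and strip surrounding punctuation (a simplifying assumption)."""
    tokens = (word.strip(".,;:!?\"'()") for word in response.lower().split())
    return [t for t in tokens if t]


def bigram_counts(responses):
    """Count ordered pairs of words that are immediately adjacent within a response."""
    counts = Counter()
    for response in responses:
        tokens = tokenize(response)
        counts.update(zip(tokens, tokens[1:]))
    return counts


def cooccurrence_counts(responses):
    """Count unordered pairs of distinct words that appear anywhere in the same response."""
    counts = Counter()
    for response in responses:
        vocab = sorted(set(tokenize(response)))  # sorting gives each pair one canonical order
        counts.update(combinations(vocab, 2))
    return counts


responses = ["I run with my dog"]
bigrams = bigram_counts(responses)
pairs = cooccurrence_counts(responses)
```

Here `("run", "with")` appears in `bigrams`, while `("dog", "run")` appears only in `pairs`, because the two words share a response without being adjacent.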
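One common way to move pair counts into Gephi is an edge-list CSV with `Source`, `Target`, and `Weight` columns, which Gephi's spreadsheet importer recognizes. The sketch below is an assumption about how such a file could be produced, not a record of our export step. The function name, the `min_weight` threshold, and the example counts are all hypothetical.

```python
import csv
import io


def edges_to_csv(pair_counts, min_weight=1):
    """Render {(word_a, word_b): count} as edge-list CSV text for a network tool."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(pair_counts.items()):
        if weight >= min_weight:  # drop rare pairs to keep the plot readable
            writer.writerow([source, target, weight])
    return buffer.getvalue()


# Made-up counts for illustration only; these are not results from the survey data.
example_counts = {("dog", "run"): 3, ("dog", "walk"): 1}
csv_text = edges_to_csv(example_counts, min_weight=2)
```

Writing `csv_text` to a `.csv` file produces an edge table that can be loaded as an undirected, weighted network.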
Black soldiers commonly discussed
3 Social Networks with Unionized Terminology
We are particularly interested in soldiers' discussions of in-groups and out-groups of people. We decided to examine these by unionizing biterms. For example, a naive co-occurrence with "black" may be "people", but we care about the discussion of "black people" rather than just the identification of "people" as co-occurring with the word "black". To capture this, we complete several unionizations of biterms to create co-occurrence networks of discussions of groups of people.
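As a rough illustration of unionization, the sketch below merges selected adjacent word pairs into single tokens before any co-occurrence counting. The function name, the underscore joiner, the compound list, and the example tokens are hypothetical choices made for this sketch. They are not taken from our data or code.

```python
def unionize(tokens, compounds):
    """Merge adjacent word pairs listed in `compounds` into single underscore-joined tokens."""
    merged = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if len(pair) == 2 and pair in compounds:
            merged.append("_".join(pair))  # e.g. "black" + "people" becomes "black_people"
            i += 2  # skip both words of the merged pair
        else:
            merged.append(tokens[i])
            i += 1
    return merged


# Hypothetical compound list and token sequence, for illustration only.
compounds = {("black", "people"), ("white", "people")}
tokens = ["black", "people", "and", "white", "people"]
merged = unionize(tokens, compounds)
```

After this step, a co-occurrence count treats `black_people` as one node in the network, so "people" no longer appears as a separate neighbor of "black".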
3.1 Long Responses
We compute unionized-term co-occurrences and social networks using long-response textual data. We separate our analysis by race and report co-occurrences and co-occurrence networks for both black and white soldiers.
3.2 Short Responses
We complete the same unionized-term analysis as above, using only short-response data from white soldiers. We did not have enough data to create plots for the two groups of white soldiers, pro-segregation and anti-segregation, so the following analysis reflects terms used by the entire group of white soldiers.